ABSTRACT

This note is a tutorial in matrix manipulation and the normal distribution of
statistics, concepts that are important for deriving and analysing the Kalman
Filter, a basic tool of signal processing. We focus on the proof of the well-known
fact that the sum of two independent n-dimensional normally distributed random
variables is also normally distributed. While this theorem is usually taken for
granted in the signal processing field, proving it provides an insightful excursion
into techniques such as Gaussian integrals and the Matrix Inversion Lemma.
Proofs and Techniques Useful for Deriving the Kalman Filter


EXECUTIVE  SUMMARY 

Much  analysis  in  the  field  of  tracking  and  signal  processing  involves  many  parameters. 
One  important  example  is  the  Kalman  Filter,  an  algorithm  that  updates  the  estimated 
values  of  parameters  based  on  their  previous  estimated  values  and  a  set  of  observations. 

Parameters  such  as  these  are  usually  best  arranged  in  a  vector  for  economy  of  language. 
Any  linearity  inherent  in  the  technique  being  described  can  then  be  expressed  using  matrix 
language.  Deriving  and  analysing  the  Kalman  Filter  is  one  such  example  of  this,  so  that  a 
good  command  of  matrix  manipulation  becomes  useful  to  the  field.  For  example,  matrices 
and  vectors  are  used  to  manipulate  the  normal  probability  density  functions  used  in  the 
Kalman  Filter. 

In this note, we have used some of these techniques to prove the well-known fact that
the sum of two independent n-dimensional normally distributed random variables is also
normally distributed. While this theorem is usually taken for granted in the signal
processing field, proving it is an insightful exercise in applying some useful matrix
techniques, such as Gaussian integrals and the Matrix Inversion Lemma.
1  Getting Started: the Proof for One-Dimensional Variables


We begin by stating the theorem to be proved: that the sum of two independent Gaussian
random variables is itself Gaussian distributed. Its mean is the sum of the individual
means, and its variance (or covariance in the n-dimensional case) is the sum of the
individual variances (or covariances). The proof of this fact uses some techniques and
results that are useful knowledge for anyone undertaking analytical work in the field of
tracking. These techniques and proofs are, in fact, not easy to locate in the literature,
and so we present them here. We have not aimed for any extreme economy in how the process
has been carried out. Rather, the calculation is done from a first-principles point of
view, precisely because of its effectiveness as an exercise in matrix manipulation.

Two results that are needed are given in the appendices. The first is the result of an
n-dimensional integration of a Gaussian function. The second appendix gives a
conveniently short form of the very useful Matrix Inversion Lemma, from which all other
forms of that lemma can be derived in a straightforward way (as demonstrated by an
example in that appendix).

The  sum-of-Gaussians  result  is  first  proved  here  in  one  dimension,  to  give  a  feel  for  the 
approach  to  be  followed  in  the  n-dimensional  case.  Consider  two  random  variables 


$$ x \sim \mathcal{N}(\bar{x}, \sigma_x^2) \quad\text{and}\quad y \sim \mathcal{N}(\bar{y}, \sigma_y^2), \quad\text{with}\quad z = x + y, \tag{1.1} $$


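by which we mean that two Gaussian density functions are being considered:

$$ p_x(x) = \frac{1}{\sigma_x\sqrt{2\pi}} \exp\left[\frac{-(x-\bar{x})^2}{2\sigma_x^2}\right], \qquad p_y(y) = \frac{1}{\sigma_y\sqrt{2\pi}} \exp\left[\frac{-(y-\bar{y})^2}{2\sigma_y^2}\right]. \tag{1.2} $$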


The task is to compute the sum density, p(z). If x, y are independent, then the
probability p(z) dz that z is found in some interval [z, z + dz] equals the product of the
probabilities that x is found in the interval [x, x + dx] and y is found in a
corresponding interval constrained to ensure that y = z − x, summed over all x:

$$ p(z)\,dz = \int_x p_x(x)\,dx \; p_y(y)\,dy \,\Big|_{y = z - x}. \tag{1.3} $$

Here are two different ways to analyse this integral.


First way: change of variables. Consider new variables X, z, functions of x, y defined via

$$ x = X, \qquad y = z - X. \tag{1.4} $$

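Changing variables in (1.3) gives [1]

$$ \begin{aligned} p(z)\,dz &= \int_X p_x(X)\, p_y(z - X) \left| \frac{\partial(x,y)}{\partial(X,z)} \right| dX\,dz \\ &= \int_X p_x(X)\, p_y(z - X)\, dX\,dz, \qquad\text{since}\quad \frac{\partial(x,y)}{\partial(X,z)} = \begin{vmatrix} 1 & 0 \\ -1 & 1 \end{vmatrix} = 1. \end{aligned} \tag{1.5} $$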


But X is now just a dummy variable of integration, so change it to x to give the
required expression:

$$ p(z) = \int_{-\infty}^{\infty} p_x(x)\, p_y(z - x)\, dx. \tag{1.6} $$


Second way: graphical viewpoint. Alternatively, refer to Fig. 1, the blue region of
which shows the points (x, y) such that y is constrained to an infinitesimal region
around y = z − x, and z lies in [z, z + dz]. The area of the shaded tile is dx dy. But
this area is also dx dz. Thus (1.3) becomes

$$ p(z)\,dz = \int_{-\infty}^{\infty} p_x(x)\, p_y(z - x)\, dx\,dz, \tag{1.7} $$

in which case

$$ p(z) = \int_{-\infty}^{\infty} p_x(x)\, p_y(z - x)\, dx, \tag{1.8} $$

agreeing with (1.6).
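As a numerical illustration of (1.6), the following minimal sketch approximates the integral by a Riemann sum for two Gaussian densities and compares it with a Gaussian of mean equal to the sum of the means and variance equal to the sum of the variances. The function names, the parameter values and the grid settings are assumptions made for this example only.

```python
import numpy as np

def gaussian_pdf(x, mean, sigma):
    """One-dimensional normal density, in the form of (1.2)."""
    return np.exp(-(x - mean) ** 2 / (2 * sigma ** 2)) / (sigma * np.sqrt(2 * np.pi))

def sum_density(z, mean_x, sigma_x, mean_y, sigma_y, half_width=12.0, n=20001):
    """Approximate p(z) = integral of p_x(x) p_y(z - x) dx by a Riemann sum."""
    # Integrate over a window centred on mean_x that is wide enough
    # (half_width standard deviations each way) to hold essentially all of p_x.
    x = np.linspace(mean_x - half_width * sigma_x, mean_x + half_width * sigma_x, n)
    dx = x[1] - x[0]
    integrand = gaussian_pdf(x, mean_x, sigma_x) * gaussian_pdf(z - x, mean_y, sigma_y)
    return float(np.sum(integrand) * dx)

# Hypothetical example values, chosen only for illustration.
mean_x, sigma_x = 1.0, 0.5
mean_y, sigma_y = -2.0, 1.5
z = 0.3

numeric = sum_density(z, mean_x, sigma_x, mean_y, sigma_y)
# Gaussian with summed means and summed variances, evaluated at the same z.
closed_form = float(gaussian_pdf(z, mean_x + mean_y, np.hypot(sigma_x, sigma_y)))

print(numeric, closed_form)
```

The two printed numbers should agree to many decimal places. A plain Riemann sum is adequate here because the integrand is smooth and is negligible at both ends of the window.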

Equation  (1.6)  is  a  convolution  integral,  and  relates  the  technique  of  convolution  to 
a  summing  of  random  variables.  Using  it,  we  are  able  to  construct  p(z)  given  the  two 
functions  in  (1.2): 


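$$ p(z) = \frac{1}{\sigma_x \sigma_y 2\pi} \int_{-\infty}^{\infty} \exp\left[ \frac{-(x - \bar{x})^2}{2\sigma_x^2} - \frac{(z - x - \bar{y})^2}{2\sigma_y^2} \right] dx. \tag{1.9} $$

The brackets of (1.9) expand to

$$ -x^2 \left( \frac{1}{2\sigma_x^2} + \frac{1}{2\sigma_y^2} \right) + x \left( \frac{\bar{x}}{\sigma_x^2} + \frac{z - \bar{y}}{\sigma_y^2} \right) - \frac{\bar{x}^2}{2\sigma_x^2} - \frac{(z - \bar{y})^2}{2\sigma_y^2}, \tag{1.10} $$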


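which, being a quadratic in x, allows the integral (1.9) to be done:

$$ p(z) = \frac{1}{\sigma_x \sigma_y 2\pi} \sqrt{\frac{\pi}{\frac{1}{2\sigma_x^2} + \frac{1}{2\sigma_y^2}}} \; \exp\left[ \frac{\left( \frac{\bar{x}}{\sigma_x^2} + \frac{z - \bar{y}}{\sigma_y^2} \right)^2}{4\left( \frac{1}{2\sigma_x^2} + \frac{1}{2\sigma_y^2} \right)} - \frac{\bar{x}^2}{2\sigma_x^2} - \frac{(z - \bar{y})^2}{2\sigma_y^2} \right]. \tag{1.11} $$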
Figure 1: A graphical depiction of the change of variables in (1.7)

The brackets of (1.11) simplify considerably. Equation (1.11) can be written more
succinctly by defining $\sigma = \sqrt{\sigma_x^2 + \sigma_y^2}$, producing

$$ p(z) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left\{ \frac{-[z - (\bar{x} + \bar{y})]^2}{2\sigma^2} \right\}. \tag{1.12} $$


But this is a Gaussian function with mean $\bar{x} + \bar{y}$ and variance $\sigma_x^2 + \sigma_y^2$. That is,

$$ x + y \sim \mathcal{N}(\bar{x} + \bar{y},\; \sigma_x^2 + \sigma_y^2), \tag{1.13} $$


as  was  required  to  be  proved. 
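The result (1.13) can also be checked empirically by drawing samples. The sketch below is an illustration under assumed parameter values, not part of the proof: it sums independent normal samples and compares the sample mean, the sample variance and the fraction of samples within one standard deviation of the mean against the values that (1.13) predicts.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

# Hypothetical example values, chosen only for illustration.
mean_x, sigma_x = 1.0, 0.5
mean_y, sigma_y = -2.0, 1.5
n_samples = 1_000_000

# Independent draws of x and y, then their sum z = x + y.
x = rng.normal(mean_x, sigma_x, n_samples)
y = rng.normal(mean_y, sigma_y, n_samples)
z = x + y

predicted_mean = mean_x + mean_y                 # -1.0
predicted_var = sigma_x ** 2 + sigma_y ** 2      # 2.5

sample_mean = float(z.mean())
sample_var = float(z.var())
# For a normal variable about 68.27% of the mass lies within one sigma.
within_one_sigma = float(np.mean(np.abs(z - predicted_mean) < np.sqrt(predicted_var)))

print(sample_mean, sample_var, within_one_sigma)
```

Matching the mean and variance alone would not establish normality, which is why the one-sigma fraction is included as a rough check on the shape. The analytical proof above remains the actual argument.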


2  The  Proof  for  n-Dimensional  Variables 


The  proof  that  the  sum  of  two  n-dimensional  Gaussians  gives  another  Gaussian  follows 
the  same  line  of  reasoning  as  in  the  1-dimensional  case,  but  is  more  involved  owing  to  the 
many  matrix  manipulations  required. 

Begin with two n-dimensional Gaussian variables (all vectors are columns in what
follows):

$$ x = [x_1 \dots x_n]^t, \qquad y = [y_1 \dots y_n]^t, \tag{2.1} $$

with

$$ x \sim \mathcal{N}(\bar{x}, P_x) \quad\text{and}\quad y \sim \mathcal{N}(\bar{y}, P_y). \tag{2.2} $$
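Their density functions are extensions of (1.2):

$$ \begin{aligned} p_x(x) &= \frac{1}{(2\pi)^{n/2} |P_x|^{1/2}} \exp\left[ -\tfrac{1}{2} (x - \bar{x})^t P_x^{-1} (x - \bar{x}) \right], \\ p_y(y) &= \frac{1}{(2\pi)^{n/2} |P_y|^{1/2}} \exp\left[ -\tfrac{1}{2} (y - \bar{y})^t P_y^{-1} (y - \bar{y}) \right]. \end{aligned} \tag{2.3} $$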

Define z = x + y. We wish to show that z is normally distributed with mean
$\bar{x} + \bar{y}$ and covariance $P_x + P_y$.

The  proof  begins  by  creating  a  convolution  integral,  just  as  in  the  1-dimensional  case. 
To  see  how  it  comes  about,  consider  (for  brevity)  the  case  of  n  =  2  dimensions.  Just  as 
d y  =  d z  in  the  1-dimensional  case  of  Fig.  1,  so  also  in  the  2-dimensional  case  we  have 


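$$ \begin{aligned} p(z)\,dz_1\,dz_2 &= \text{probability that } x_1 \in [x_1, x_1 + dx_1] \text{ and } x_2 \in [x_2, x_2 + dx_2] \\ &\qquad \text{and } y_1 \in [y_1, y_1 + dz_1] \text{ and } y_2 \in [y_2, y_2 + dz_2] \\ &= \iint p_x(x_1, x_2)\, p_y(z_1 - x_1, z_2 - x_2)\, dx_1\,dx_2\; dz_1\,dz_2, \end{aligned} \tag{2.4} $$

so that

$$ p(z) = \iint p_x(x_1, x_2)\, p_y(z_1 - x_1, z_2 - x_2)\, dx_1\,dx_2. \tag{2.5} $$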

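This is seen to extend to n dimensions, in which case the required integral is

$$ p(z) = \frac{1}{(2\pi)^n |P_x P_y|^{1/2}} \int \exp\left\{ -\tfrac{1}{2} \left[ (x - \bar{x})^t P_x^{-1} (x - \bar{x}) + (z - x - \bar{y})^t P_y^{-1} (z - x - \bar{y}) \right] \right\} dx_1 \dots dx_n. \tag{2.6} $$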

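The integration is over x, so collecting terms in x within the brackets in (2.6) gives

$$ \begin{aligned} p(z) = \frac{1}{(2\pi)^n |P_x P_y|^{1/2}} \int \exp\Big[ &-x^t \left( \frac{P_x^{-1} + P_y^{-1}}{2} \right) x + \left( \bar{x}^t P_x^{-1} + (z - \bar{y})^t P_y^{-1} \right) x \\ &- \tfrac{1}{2} \bar{x}^t P_x^{-1} \bar{x} - \tfrac{1}{2} (z - \bar{y})^t P_y^{-1} (z - \bar{y}) \Big]\, dx_1 \dots dx_n, \end{aligned} \tag{2.7} $$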


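which integrates via (A2) to give

$$ \begin{aligned} p(z) = \frac{\pi^{n/2}}{(2\pi)^n |P_x P_y|^{1/2} \left| \frac{P_x^{-1} + P_y^{-1}}{2} \right|^{1/2}} \exp\Big[ &\tfrac{1}{4} \left( \bar{x}^t P_x^{-1} + (z - \bar{y})^t P_y^{-1} \right) \left( \frac{P_x^{-1} + P_y^{-1}}{2} \right)^{-1} \left( P_x^{-1} \bar{x} + P_y^{-1} (z - \bar{y}) \right) \\ &- \tfrac{1}{2} \bar{x}^t P_x^{-1} \bar{x} - \tfrac{1}{2} (z - \bar{y})^t P_y^{-1} (z - \bar{y}) \Big]. \end{aligned} \tag{2.8} $$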


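Define a matrix P such that $P^{-1} = P_x^{-1} + P_y^{-1}$. In that case

$$ \left| \frac{P_x^{-1} + P_y^{-1}}{2} \right|^{1/2} = \left| \frac{P^{-1}}{2} \right|^{1/2} = \frac{|P|^{-1/2}}{2^{n/2}} \qquad\text{and}\qquad \left( \frac{P_x^{-1} + P_y^{-1}}{2} \right)^{-1} = 2P. \tag{2.9} $$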
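Thus

$$ p(z) = \frac{|P|^{1/2}}{(2\pi)^{n/2} |P_x P_y|^{1/2}} \exp B, \tag{2.10} $$

where (setting $\alpha = z - \bar{y}$)

$$ \begin{aligned} 2B &= \left( \bar{x}^t P_x^{-1} + \alpha^t P_y^{-1} \right) P \left( P_x^{-1} \bar{x} + P_y^{-1} \alpha \right) - \bar{x}^t P_x^{-1} \bar{x} - \alpha^t P_y^{-1} \alpha \\ &= \bar{x}^t \left( P_x^{-1} P P_x^{-1} - P_x^{-1} \right) \bar{x} + 2 \bar{x}^t P_x^{-1} P P_y^{-1} \alpha + \alpha^t \left( P_y^{-1} P P_y^{-1} - P_y^{-1} \right) \alpha. \end{aligned} \tag{2.11} $$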

There are three expressions involving $P, P_x, P_y$ in the last line of (2.11) that need
simplifying. This can be done using the Matrix Inversion Lemma, explained in Appendix B.
Hoping to prove that the covariance of z is $P_x + P_y$, we will aim to have $P_x + P_y$
appear wherever possible.


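$$ \begin{aligned} P_x^{-1} P P_x^{-1} - P_x^{-1} &\overset{\text{(B4)}}{=} -(P_x + P_y)^{-1}, \\ P_x^{-1} P P_y^{-1} &= \left[ P_y \left( P_x^{-1} + P_y^{-1} \right) P_x \right]^{-1} = (P_x + P_y)^{-1}, \\ P_y^{-1} P P_y^{-1} - P_y^{-1} &= -(P_x + P_y)^{-1} \quad \text{(from two lines up with } x \leftrightarrow y\text{)}. \end{aligned} \tag{2.12} $$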
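The three identities in (2.12) are easy to spot-check numerically. The following sketch, which assumes randomly generated symmetric positive-definite matrices standing in for the covariances (the helper `random_covariance` is hypothetical), evaluates both sides of each line. It also checks the determinant relation that converts the prefactor of (2.10) into that of (2.14).

```python
import numpy as np

rng = np.random.default_rng(seed=1)

def random_covariance(n, rng):
    """Return a random symmetric positive-definite n x n matrix (illustrative helper)."""
    m = rng.standard_normal((n, n))
    return m @ m.T + n * np.eye(n)

n = 4
Px = random_covariance(n, rng)
Py = random_covariance(n, rng)
inv = np.linalg.inv
det = np.linalg.det

# P is defined through its inverse, as in the text preceding (2.9).
P = inv(inv(Px) + inv(Py))
S_inv = inv(Px + Py)

first = inv(Px) @ P @ inv(Px) - inv(Px)    # expected: -(Px + Py)^-1
second = inv(Px) @ P @ inv(Py)             # expected: +(Px + Py)^-1
third = inv(Py) @ P @ inv(Py) - inv(Py)    # expected: -(Px + Py)^-1

identities_hold = (
    np.allclose(first, -S_inv)
    and np.allclose(second, S_inv)
    and np.allclose(third, -S_inv)
)

# |P| / |Px Py| = 1 / |Px + Py|, the determinant relation behind (2.14).
determinants_agree = bool(np.isclose(det(P) / (det(Px) * det(Py)), 1.0 / det(Px + Py)))

print(identities_hold, determinants_agree)
```

A numerical check of this kind does not replace the derivation via the Matrix Inversion Lemma, but it is a quick guard against sign and ordering slips when manipulating such products by hand.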

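Thus

$$ \begin{aligned} 2B &= -\bar{x}^t (P_x + P_y)^{-1} \bar{x} + 2 \bar{x}^t (P_x + P_y)^{-1} \alpha - \alpha^t (P_x + P_y)^{-1} \alpha \\ &= -(\bar{x} - \alpha)^t (P_x + P_y)^{-1} (\bar{x} - \alpha) \\ &= -[z - (\bar{x} + \bar{y})]^t (P_x + P_y)^{-1} [z - (\bar{x} + \bar{y})]. \end{aligned} \tag{2.13} $$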


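Finally, (2.10) can be written as

$$ \begin{aligned} p(z) &= \frac{1}{(2\pi)^{n/2} \left| P_x^{-1} + P_y^{-1} \right|^{1/2} |P_x P_y|^{1/2}} \exp\left\{ -\tfrac{1}{2} [z - (\bar{x} + \bar{y})]^t (P_x + P_y)^{-1} [z - (\bar{x} + \bar{y})] \right\} \\ &= \frac{1}{(2\pi)^{n/2} |P_x + P_y|^{1/2}} \exp\left\{ -\tfrac{1}{2} [z - (\bar{x} + \bar{y})]^t (P_x + P_y)^{-1} [z - (\bar{x} + \bar{y})] \right\}. \end{aligned} \tag{2.14} $$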


That is, $z \sim \mathcal{N}(\bar{x} + \bar{y},\; P_x + P_y)$, as was required to be proved.
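The n-dimensional result can be illustrated for n = 2 by evaluating the convolution (2.5) on a grid and comparing it with a normal density of mean equal to the sum of the means and covariance equal to the sum of the covariances. The sketch below makes several assumptions for the sake of the example: the means, covariances, evaluation point and grid settings are all hypothetical, and `mvn_pdf` is a helper written here rather than a library routine.

```python
import numpy as np

def mvn_pdf(points, mean, cov):
    """n-dimensional normal density, in the form of (2.3), at points of shape (..., n)."""
    n = mean.size
    diff = points - mean
    # Quadratic form diff^t cov^-1 diff for every point at once.
    quad = np.einsum("...i,ij,...j->...", diff, np.linalg.inv(cov), diff)
    norm = (2 * np.pi) ** (n / 2) * np.sqrt(np.linalg.det(cov))
    return np.exp(-0.5 * quad) / norm

# Hypothetical example values, chosen only for illustration.
mean_x = np.array([1.0, -1.0])
mean_y = np.array([0.5, 2.0])
Px = np.array([[2.0, 0.6], [0.6, 1.0]])
Py = np.array([[1.0, -0.3], [-0.3, 0.5]])
z = np.array([2.0, 0.5])

# Grid over (x1, x2), eight standard deviations each way around mean_x.
x1 = np.linspace(mean_x[0] - 8 * np.sqrt(Px[0, 0]), mean_x[0] + 8 * np.sqrt(Px[0, 0]), 801)
x2 = np.linspace(mean_x[1] - 8 * np.sqrt(Px[1, 1]), mean_x[1] + 8 * np.sqrt(Px[1, 1]), 801)
X1, X2 = np.meshgrid(x1, x2, indexing="ij")
grid = np.stack([X1, X2], axis=-1)
cell_area = (x1[1] - x1[0]) * (x2[1] - x2[0])

# Riemann-sum approximation of the double integral in (2.5).
integrand = mvn_pdf(grid, mean_x, Px) * mvn_pdf(z - grid, mean_y, Py)
numeric = float(np.sum(integrand) * cell_area)

# Normal density with summed means and summed covariances, at the same z.
closed_form = float(mvn_pdf(z, mean_x + mean_y, Px + Py))

print(numeric, closed_form)
```

The grid approach is practical only in low dimensions, since the number of grid points grows exponentially with n. It serves here purely as a check on the closed-form result.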
Appendix A  Calculating an n-Dimensional Gaussian Integral


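This appendix takes the well-known 1-dimensional result (valid for a > 0)

$$ \int_{-\infty}^{\infty} e^{-a x^2 + b x}\, dx = \sqrt{\frac{\pi}{a}}\, \exp\frac{b^2}{4a} \tag{A1} $$

and generalises it to the less well-known but very useful n-dimensional result:

$$ \int_{-\infty}^{\infty} e^{-x^t A x + b^t x}\, dx_1 \dots dx_n = \frac{\pi^{n/2}}{|A|^{1/2}} \exp \tfrac{1}{4} b^t A^{-1} b, \tag{A2} $$

where A is a real symmetric n × n matrix (positive definite, so that the integral
converges), and b, x (and u in what follows) are n-dimensional column vectors. The "t"
superscript denotes the matrix transpose.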


First, note that because A is real and symmetric, it can be orthogonally diagonalised
to give $A = P A' P^t$ with $P^{-1} = P^t$ and $A'$ diagonal. Use this P to create a change
of variables to u, via x = Pu. Denote the left-hand side of (A2) by I, which we must show
equals the right-hand side of (A2). The change of variables converts the left-hand side
of (A2) to [1]


$$ I = \int_{-\infty}^{\infty} e^{-u^t A' u + b^t P u} \left| \frac{\partial(x_1, \dots, x_n)}{\partial(u_1, \dots, u_n)} \right| du_1 \dots du_n. \tag{A3} $$


Since the elements of P are constants, the ijth element of the Jacobian matrix is $P_{ij}$.
Thus the Jacobian matrix is just P, so that the Jacobian determinant is

$$ \frac{\partial(x_1, \dots, x_n)}{\partial(u_1, \dots, u_n)} = |P| \in \{\pm 1\}, \quad\text{so that}\quad \left| \frac{\partial(x_1, \dots, x_n)}{\partial(u_1, \dots, u_n)} \right| = 1. \tag{A4} $$
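The orthogonal diagonalisation and the unit Jacobian can be illustrated with a small numerical example. This sketch assumes an arbitrary symmetric matrix chosen for illustration and uses `numpy.linalg.eigh`, which returns orthonormal eigenvectors as the columns of P for a symmetric input.

```python
import numpy as np

# Hypothetical symmetric matrix, chosen only for illustration.
A = np.array([[2.0, 0.5, 0.0],
              [0.5, 1.0, 0.3],
              [0.0, 0.3, 1.5]])

# Columns of P are orthonormal eigenvectors, so that A = P A' P^t.
eigenvalues, P = np.linalg.eigh(A)
A_prime = P.T @ A @ P

is_orthogonal = bool(np.allclose(P.T @ P, np.eye(3)))             # P^-1 = P^t
is_diagonal = bool(np.allclose(A_prime, np.diag(eigenvalues)))    # A' is diagonal
jacobian_abs = float(abs(np.linalg.det(P)))                       # |det P| = 1
# The product of the diagonal elements of A' equals |A|.
same_determinant = bool(np.isclose(np.prod(eigenvalues), np.linalg.det(A)))

print(is_orthogonal, is_diagonal, jacobian_abs, same_determinant)
```

The sign of det P depends on the eigenvector signs that the routine happens to return, which is why only its absolute value is compared with 1, in line with the Jacobian argument above.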


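Now set $b'^t = b^t P$, and write I as

$$ \begin{aligned} I &= \int_{-\infty}^{\infty} e^{-u^t A' u + b'^t u}\, du_1 \dots du_n = \int_{-\infty}^{\infty} \exp \sum_i \left( -A'_{ii} u_i^2 + b'_i u_i \right) du_1 \dots du_n \\ &= \prod_i \int_{-\infty}^{\infty} \exp\left( -A'_{ii} u_i^2 + b'_i u_i \right) du_i \overset{\text{(A1)}}{=} \prod_i \sqrt{\frac{\pi}{A'_{ii}}}\, \exp \frac{b_i'^2}{4 A'_{ii}}. \end{aligned} \tag{A5} $$

But $\prod_i A'_{ii} = |A'| = |P^t A P| = |A|$, so

$$ I = \frac{\pi^{n/2}}{|A|^{1/2}} \exp \sum_i \frac{b_i'^2}{4 A'_{ii}}. \tag{A6} $$

Also

$$ \sum_i \frac{b_i'^2}{A'_{ii}} = b'^t \begin{bmatrix} 1/A'_{11} & & 0 \\ & \ddots & \\ 0 & & 1/A'_{nn} \end{bmatrix} b' = b'^t A'^{-1} b' = b^t P P^t A^{-1} P P^t b = b^t A^{-1} b. \tag{A7} $$

Thus (A6) becomes

$$ I = \frac{\pi^{n/2}}{|A|^{1/2}} \exp \tfrac{1}{4} b^t A^{-1} b, \tag{A8} $$

which is the right-hand side of (A2). QED.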
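As a sanity check on (A2), the integral can be approximated on a grid for n = 2 and compared with the closed form. The matrix A, the vector b and the grid settings below are assumptions made for this sketch. A is chosen symmetric and positive definite so that the integral converges.

```python
import numpy as np

# Hypothetical example values, chosen only for illustration.
A = np.array([[1.5, 0.4],
              [0.4, 1.0]])      # symmetric, positive definite
b = np.array([0.3, -0.7])
n = 2

# Grid over (x1, x2); the integrand is negligible beyond |x_i| = 8.
u = np.linspace(-8.0, 8.0, 801)
du = u[1] - u[0]
X1, X2 = np.meshgrid(u, u, indexing="ij")
x = np.stack([X1, X2], axis=-1)

# Exponent -x^t A x + b^t x evaluated at every grid point.
exponent = -np.einsum("...i,ij,...j->...", x, A, x) + x @ b
lhs = float(np.sum(np.exp(exponent)) * du ** 2)

# Right-hand side of (A2).
rhs = float(
    np.pi ** (n / 2) / np.sqrt(np.linalg.det(A))
    * np.exp(0.25 * b @ np.linalg.inv(A) @ b)
)

print(lhs, rhs)
```

The two numbers should agree closely. As with any grid integration, the comparison is only meaningful when the grid spacing is small relative to the width of the integrand and the window extends well past the region where it is non-negligible.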
Appendix B  Matrix Inversion Lemma

The  Matrix  Inversion  Lemma  is  often  used  and  very  powerful  in  the  matrix  analysis  of 
signal  processing.  It  comes  in  various  forms,  but  they  are  all  easily  derived  from  the 
following  basic  form  of  the  lemma.  For  any  matrices  A  and  B  not  necessarily  square,  as 
long  as  the  products  AB  and  BA  exist  and  the  relevant  matrices  are  invertible, 

$$ (I + AB)^{-1} = I - A (I + BA)^{-1} B, \tag{B1} $$

where by I throughout this appendix we mean the identity matrix of size appropriate
to its use.

The  lemma  can  be  proved  by  multiplying  the  inverse  of  the  left-hand  side  of  (Bl)  by 
its  right-hand  side  and  inspecting  the  result: 

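$$ \begin{aligned} \text{LHS}^{-1} \cdot \text{RHS} &= (I + AB) \left[ I - A (I + BA)^{-1} B \right] \\ &= I + AB - A (I + BA)^{-1} B - ABA (I + BA)^{-1} B \\ &= I + A \left[ I - (I + BA)^{-1} - BA (I + BA)^{-1} \right] B \\ &= I + A \left[ I - (I + BA)(I + BA)^{-1} \right] B \\ &= I. \end{aligned} \tag{B2} $$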

In  that  case,  LHS  =  RHS,  and  the  lemma  is  proved. 
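The basic form (B1) holds for non-square A and B, in which case the two identity matrices have different sizes. The sketch below illustrates this with a 2 × 3 matrix A and a 3 × 2 matrix B. The particular entries are assumptions chosen so that both I + AB and I + BA are invertible.

```python
import numpy as np

# Hypothetical non-square matrices, chosen only for illustration.
A = np.array([[1.0, 2.0, 0.0],
              [0.0, 1.0, 1.0]])      # 2 x 3
B = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])           # 3 x 2

I2 = np.eye(2)   # identity matching AB (2 x 2)
I3 = np.eye(3)   # identity matching BA (3 x 3)

lhs = np.linalg.inv(I2 + A @ B)
rhs = I2 - A @ np.linalg.inv(I3 + B @ A) @ B

lemma_holds = bool(np.allclose(lhs, rhs))
print(lemma_holds)
```

Here the left-hand side inverts a 2 × 2 matrix while the right-hand side inverts a 3 × 3 one. In applications the lemma is typically used in whichever direction makes the matrix to be inverted smaller.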

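More complicated versions of the lemma make good use of the fact that
$(PQ)^{-1} = Q^{-1} P^{-1}$ for any invertible matrices P, Q. For example, apply the lemma
to $(A + BCD)^{-1}$:

$$ \begin{aligned} (A + BCD)^{-1} &= \left[ (I + BCDA^{-1}) A \right]^{-1} \\ &= A^{-1} (I + BCDA^{-1})^{-1} \\ &= A^{-1} \left( I - BC \left[ I + DA^{-1}BC \right]^{-1} DA^{-1} \right) \\ &= A^{-1} \left( I - B \left[ (I + DA^{-1}BC) C^{-1} \right]^{-1} DA^{-1} \right) \\ &= A^{-1} \left( I - B \left[ C^{-1} + DA^{-1}B \right]^{-1} DA^{-1} \right). \end{aligned} \tag{B3} $$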

Finally, 

$$ (A + BCD)^{-1} = A^{-1} - A^{-1} B \left( C^{-1} + DA^{-1}B \right)^{-1} DA^{-1}, \tag{B4} $$

which  is  a  common  form  of  the  Matrix  Inversion  Lemma. 
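The form (B4) can be spot-checked in the same way. In this sketch the matrices are randomly generated with assumed sizes n = 5 and k = 2, and A and C are built close to multiples of the identity so that every inverse involved exists. None of these choices comes from the text.

```python
import numpy as np

rng = np.random.default_rng(seed=4)
n, k = 5, 2   # hypothetical sizes: A is n x n, C is k x k

# A and C are kept near multiples of the identity so they are safely invertible.
A = 4.0 * np.eye(n) + 0.1 * rng.standard_normal((n, n))
C = 2.0 * np.eye(k) + 0.1 * rng.standard_normal((k, k))
B = 0.3 * rng.standard_normal((n, k))
D = 0.3 * rng.standard_normal((k, n))

inv = np.linalg.inv
A_inv = inv(A)

lhs = inv(A + B @ C @ D)
# Right-hand side of (B4): the inner inverse is only k x k.
rhs = A_inv - A_inv @ B @ inv(inv(C) + D @ A_inv @ B) @ D @ A_inv

woodbury_holds = bool(np.allclose(lhs, rhs))
print(woodbury_holds)
```

When the inverse of A is already known and k is much smaller than n, the right-hand side needs only k × k inversions, which is one reason this form is so widely used.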